About STL-10
Inspired by the CIFAR-10 dataset, STL-10 is a dataset of images (gathered from ImageNet) of animals and transportation objects. It has 10 classes: 6 animal classes (bird, cat, deer, dog, horse, monkey) and 4 transportation classes (airplane, car, ship, truck).
The dataset contains 3 folders, each used at a different stage: Train (5,000 labeled images), Test (8,000 labeled images), and Unlabeled.
Besides not sharing an identical set of classes, the two datasets differ in resolution: STL-10 images are 96x96, three times the side length of CIFAR-10's 32x32 images (nine times as many pixels).
STL-10 is specifically an image recognition dataset. It is intended for developing unsupervised feature learning, deep learning, and self-taught learning algorithms. That being said, the primary prediction task is to determine the type of animal or transportation object found in each of the pictures in the Unlabeled folder. Note that the Unlabeled folder contains, in addition to the classes listed above, other types of animals (bears, rabbits, etc.) and transportation objects (trains, buses, etc.).
Measure of Success
One reason this data is important: if a model is trained correctly and the prediction task is achieved, third parties that use image CAPTCHAs on their websites, networks, etc. could use it to see how CAPTCHAs can be bypassed by unsupervised feature learning, which essentially defeats the purpose of having a CAPTCHA test.
For this data to be of use to third parties using CAPTCHAs, I believe the prediction algorithm needs to reach at least 80% accuracy. The bar is not 90% because CAPTCHA tests often tolerate about 2 or fewer errors, such as a wrongly selected or unrecognized image.
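The tolerance argument above can be made concrete with a rough binomial model. The sketch below assumes a hypothetical CAPTCHA with 9 image tiles, each classified independently with the same accuracy, and a pass rule of at most 2 errors. The tile count, the independence assumption, and the helper name `pass_probability` are illustrative and not taken from any real CAPTCHA system.

```python
from math import comb

def pass_probability(accuracy, n_tiles=9, max_errors=2):
    """P(at most max_errors mistakes) when each of n_tiles is classified
    independently with the given per-image accuracy (binomial model)."""
    error = 1.0 - accuracy
    return sum(
        comb(n_tiles, k) * error**k * accuracy**(n_tiles - k)
        for k in range(max_errors + 1)
    )

# Hypothetical 9-tile challenge that tolerates up to 2 wrong tiles
for acc in (0.7, 0.8, 0.9):
    print(f"accuracy={acc:.0%}  pass probability={pass_probability(acc):.3f}")
```

Under these assumptions, 80% per-image accuracy would pass roughly 74% of challenges, which is one way to read the claim that 80% is already a meaningful threshold.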
Dataset : STL-10 Kaggle Dataset
Question Of Interest : Identify the type of animal or transportation object shown in the picture
import glob
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from PIL import Image
from tkinter import Tcl
from sklearn import preprocessing
%matplotlib inline
le = preprocessing.LabelEncoder()
# Load list of labels/classes
with open('../data/stl10_binary/class_names.txt', 'r') as f:
    labels = [line.rstrip() for line in f]
print("=========== Classes ===========")
for i, name in enumerate(labels):
    print(i, ":", name)
=========== Classes ===========
0 : airplane
1 : bird
2 : car
3 : cat
4 : deer
5 : dog
6 : horse
7 : monkey
8 : ship
9 : truck
# Training Dataset #########################################################################
""" Reading in the TRAINING dataset (sorted by filename) into a numpy array """
file_train = list(Tcl().call('lsort', '-dict', glob.glob('../data/STL10/Train/*.png')))
rgb_train_matrix = np.array([np.array(Image.open(file)) for file in file_train])
_, h, w, c = rgb_train_matrix.shape
""" Gather label's indices # (Training | y)"""
with open('../data/stl10_binary/train_y.bin', 'rb') as f:
    yTrain_temp = np.fromfile(f, dtype=np.uint8)
le.fit(yTrain_temp)
y_train = le.transform(yTrain_temp)
""" Create new numpy array w/ re-colored greyscaled images | TRAINING """
greyscale_train_matrix = np.array([np.array(Image.open(file).convert("L")) for file in file_train])
df_train = pd.DataFrame({'' : ['# of Samples','# of Features','Image Resolution','# of Channels','Image Size']})
df_train['Original Data'] = [_, h*w*c, '{} x {}'.format(h,w), c, str(h*w*c) + 'px\'s']
df_train['Greyscale Data'] = [_, h*w, '{} x {}'.format(h,w), 1, str(h*w) + 'px\'s']
rgb_train_vec = rgb_train_matrix.reshape((_,h*w*c))
greyscale_train_vec = greyscale_train_matrix.reshape((_,h*w))
X_train = greyscale_train_vec
print("Training Data")
df_train
Training Data
|  |  | Original Data | Greyscale Data |
|---|---|---|---|
| 0 | # of Samples | 5000 | 5000 |
| 1 | # of Features | 27648 | 9216 |
| 2 | Image Resolution | 96 x 96 | 96 x 96 |
| 3 | # of Channels | 3 | 1 |
| 4 | Image Size | 27648px's | 9216px's |
# Testing Dataset #########################################################################
""" Reading in the TESTING dataset (sorted by filename) into a numpy array """
file_test = list(Tcl().call('lsort', '-dict', glob.glob('../data/STL10/Test/*.png')))
rgb_test_matrix = np.array([np.array(Image.open(file)) for file in file_test])
_, h, w, c = rgb_test_matrix.shape
""" Gather label's indices # (Testing | y)"""
with open('../data/stl10_binary/test_y.bin', 'rb') as f:
    yTest_temp = np.fromfile(f, dtype=np.uint8)
# Reuse the encoder fitted on the training labels so both splits share one label mapping
y_test = le.transform(yTest_temp)
""" Create new numpy array w/ re-colored greyscaled images | TESTING """
greyscale_test_matrix = np.array([np.array(Image.open(file).convert("L")) for file in file_test])
df_test = pd.DataFrame({'' : ['# of Samples','# of Features','Image Resolution','# of Channels','Image Size']})
df_test['Original Data'] = [_, h*w*c, '{} x {}'.format(h,w), c, str(h*w*c) + 'px\'s']
df_test['Greyscale Data'] = [_, h*w, '{} x {}'.format(h,w), 1, str(h*w) + 'px\'s']
rgb_test_vec = rgb_test_matrix.reshape((_,h*w*c))
greyscale_test_vec = greyscale_test_matrix.reshape((_,h*w))
X_test = greyscale_test_vec
print("Test Data")
df_test
Test Data
|  |  | Original Data | Greyscale Data |
|---|---|---|---|
| 0 | # of Samples | 8000 | 8000 |
| 1 | # of Features | 27648 | 9216 |
| 2 | Image Resolution | 96 x 96 | 96 x 96 |
| 3 | # of Channels | 3 | 1 |
| 4 | Image Size | 27648px's | 9216px's |
print("\n[Train] Original Matrix Shape : Before", rgb_train_matrix.shape, "-----> After", rgb_train_vec.shape)
print("[Train] Greyscale Matrix Shape : Before", greyscale_train_matrix.shape, " -----> After",X_train.shape)
print()
print("\n[Test] Original Matrix Shape : Before", rgb_test_matrix.shape, "-----> After", rgb_test_vec.shape)
print("[Test] Greyscale Matrix Shape : Before", greyscale_test_matrix.shape, " -----> After", X_test.shape)
[Train] Original Matrix Shape : Before (5000, 96, 96, 3) -----> After (5000, 27648)
[Train] Greyscale Matrix Shape : Before (5000, 96, 96) -----> After (5000, 9216)

[Test] Original Matrix Shape : Before (8000, 96, 96, 3) -----> After (8000, 27648)
[Test] Greyscale Matrix Shape : Before (8000, 96, 96) -----> After (8000, 9216)
We begin by reading the datasets (train and test) into NumPy arrays. Because the images are in color, we convert them to greyscale so that each image has a single channel and later computations run faster.
The shapes of the original and greyscale matrices are then printed to show their initial dimensions. To the right of each shape is the shape of the flattened version, in which every image becomes a single row vector.
The summary tables make the differences between the 4 matrices easier to see: two hold the original color pictures and two hold the greyscale images. Note the large difference in image size: each greyscale picture has one third as many values as its color original (9,216 versus 27,648).
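To make the conversion and flattening steps concrete, here is a minimal sketch on a small synthetic batch rather than the real files. It assumes the ITU-R 601 luma weights (0.299, 0.587, 0.114), which is the formula Pillow documents for `convert("L")`; Pillow's internal rounding may differ slightly from this floating-point version. The helpers `to_greyscale` and `flatten` are hypothetical names used only for illustration.

```python
import numpy as np

# ITU-R 601 luma weights (assumed to match Pillow's "L" mode conversion)
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])

def to_greyscale(rgb_batch):
    """Collapse the channel axis of an (n, h, w, 3) uint8 batch into one luma channel."""
    grey = rgb_batch.astype(np.float64) @ LUMA_WEIGHTS
    return np.round(grey).astype(np.uint8)

def flatten(batch):
    """Turn each image into a single row vector: (n, ...) -> (n, features)."""
    return batch.reshape(batch.shape[0], -1)

rng = np.random.default_rng(0)
# Synthetic stand-in for 5 STL-10 images (96 x 96, 3 channels)
rgb = rng.integers(0, 256, size=(5, 96, 96, 3), dtype=np.uint8)
grey = to_greyscale(rgb)

print(flatten(rgb).shape)   # (5, 27648)
print(flatten(grey).shape)  # (5, 9216)
```

Dropping from 27,648 to 9,216 features per image is where the speed-up in later steps such as PCA comes from.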
import glob
import numpy as np
import pandas as pd
from PIL import Image
from tkinter import Tcl
import matplotlib.pyplot as plt
%matplotlib inline
with open('../data/stl10_binary/train_y.bin', 'rb') as f:
    labels = np.fromfile(f, dtype=np.uint8)
with open('../data/stl10_binary/class_names.txt', 'r') as f:
    classNames = [line.rstrip() for line in f]
# Reading in the dataset (sorted by filename) into a numpy array
file_X = list(Tcl().call('lsort', '-dict', glob.glob('../data/STL10/Train/*.png')))
rgb_matrix = np.array([np.array(Image.open(file)) for file in file_X])
_, h, w, c = rgb_matrix.shape
# Create new numpy array w/ re-colored greyscaled images
greyscale_matrix = np.array([np.array(Image.open(file).convert("L")) for file in file_X])
df = pd.DataFrame({'' : ['# of Samples','# of Features','Image Resolution','# of Channels','Image Size']})
df['Original Data'] = [_, h*w*c, '{} x {}'.format(h,w), c, str(h*w*c) + 'px\'s']
df['Greyscale Data'] = [_, h*w, '{} x {}'.format(h,w), 1, str(h*w) + 'px\'s']
rgb_vec = rgb_matrix.reshape((_,h*w*c))
greyscale_vec = greyscale_matrix.reshape((_,h*w))
print("\nOriginal Matrix Shape : Before", rgb_matrix.shape, "-----> After", rgb_vec.shape,"\n")
print("Greyscale Matrix Shape : Before", greyscale_matrix.shape, " -----> After",greyscale_vec.shape, "\n\n")
df
Original Matrix Shape : Before (5000, 96, 96, 3) -----> After (5000, 27648)
Greyscale Matrix Shape : Before (5000, 96, 96) -----> After (5000, 9216)
|  |  | Original Data | Greyscale Data |
|---|---|---|---|
| 0 | # of Samples | 5000 | 5000 |
| 1 | # of Features | 27648 | 9216 |
| 2 | Image Resolution | 96 x 96 | 96 x 96 |
| 3 | # of Channels | 3 | 1 |
| 4 | Image Size | 27648px's | 9216px's |
We begin by reading the dataset into a NumPy array. Because it contains color images, we convert it to a greyscale array so that later computations run faster.
The shapes of the original and greyscale matrices are then printed to show their initial dimensions. To the right of each shape is the shape of the flattened version, in which every image becomes a single row vector.
The table at the bottom of the output makes the differences between the 2 matrices easier to see, one holding the color pictures and the other the greyscale images. Note the large difference in image size: each greyscale picture has one third as many values as its color original (9,216 versus 27,648).
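The files are read in filename order presumably because the labels are loaded as a flat array, so image i has to line up with label i; a plain lexicographic sort would place `10.png` before `2.png`. The cells above use Tcl's `lsort -dict` for this. As an alternative sketch that avoids the `tkinter` dependency, a natural-sort key can be built with a regular expression. The helper name `natural_key` and the example filenames are hypothetical, and the ordering is not guaranteed to match Tcl's in every corner case.

```python
import re

def natural_key(path):
    """Split a path into text and integer chunks so numbers compare numerically."""
    return [int(chunk) if chunk.isdigit() else chunk.lower()
            for chunk in re.split(r"(\d+)", path)]

# Hypothetical filenames for illustration
files = ["Train/10.png", "Train/2.png", "Train/1.png", "Train/100.png"]

print(sorted(files))                   # lexicographic order
print(sorted(files, key=natural_key))  # natural (numeric-aware) order
```

With `glob`, this would be used as `sorted(glob.glob(pattern), key=natural_key)`.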
plt.style.use('ggplot')
def plot_gallery(images, titles, h, w, flag, n_row=3, n_col=6):
    """Plot a grid of images. If flag is True, titles are class indices into classNames;
    otherwise they are used as title strings directly."""
    plt.figure(figsize=(1.7 * n_col, 2.3 * n_row))
    plt.subplots_adjust(bottom=0, left=.01, right=.99, top=1.0, hspace=.25)
    for i in range(n_row * n_col):
        plt.subplot(n_row, n_col, i + 1)
        plt.imshow(images[i].reshape((h,w)), cmap=plt.cm.gray)
        if flag:
            plt.title(classNames[titles[i]], size=12)
        else:
            plt.title(titles[i], size=12)
        plt.xticks(()); plt.yticks(())
plot_gallery(X_train, y_train, h, w, True)
Here we visualize the first 18 images in the greyscale NumPy array. This function will be helpful later on for displaying images from any given array.
from sklearn.decomposition import PCA
# We use the function plot_explained_variance to help us visualize the explained variance
def plot_explained_variance(pca):
    import plotly
    from plotly.graph_objs import Bar, Scatter, Layout
    from plotly.graph_objs.layout import XAxis, YAxis
    plotly.offline.init_notebook_mode() # run at the start of every notebook
    # Per-component and cumulative explained variance, as percentages
    explained_var = pca.explained_variance_ratio_ * 100
    cum_var_exp = np.cumsum(explained_var)
    plotly.offline.iplot({
        "data": [Bar(y=explained_var, name='Individual explained variance'),
                 Scatter(y=cum_var_exp, name='Cumulative explained variance')],
        "layout": Layout(xaxis=XAxis(title='Principal components'), yaxis=YAxis(title='Variance Explained (%)'))
    })
# This function will help us later on to reconstruct the PCA/RPCA into an image
def reconstruct_image(trans_obj, org_features):
    low_rep = trans_obj.transform(org_features)
    rec_image = trans_obj.inverse_transform(low_rep)
    return low_rep, rec_image
n_components = 350
print("Extracting the top %d eigenObjects from %d Objects" % (n_components, X_train.shape[0]))
pca = PCA(n_components=n_components)
%time pca.fit(X_train.copy())
eigenObjects = pca.components_.reshape((n_components,h,w))
plot_explained_variance(pca)
Extracting the top 350 eigenObjects from 5000 Objects
CPU times: user 16.4 s, sys: 136 ms, total: 16.5 s
Wall time: 4.28 s
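The explained-variance plot can also guide the choice of `n_components`. As a hedged sketch on synthetic data (not the STL-10 images), the snippet below finds the smallest number of components whose cumulative explained variance reaches a target ratio. The 95% target and the helper name `components_for_variance` are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

def components_for_variance(X, target=0.95):
    """Smallest k such that the first k principal components explain >= target of the variance."""
    pca = PCA().fit(X)  # keep all components
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    # Clamp in case round-off leaves the final cumulative sum just below the target
    k = min(int(np.searchsorted(cumulative, target)) + 1, len(cumulative))
    return k, cumulative

rng = np.random.default_rng(0)
# Synthetic data: 3 strong latent directions embedded in 20 features, plus small noise
latent = rng.normal(size=(500, 3)) * np.array([10.0, 5.0, 2.0])
X = latent @ rng.normal(size=(3, 20)) + 0.01 * rng.normal(size=(500, 20))

k, cumulative = components_for_variance(X, target=0.95)
print(k, cumulative[:k])
```

scikit-learn's `PCA` also accepts a float between 0 and 1 for `n_components` to perform a similar selection internally, which may be more convenient than the manual search.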
eigenObject_titles = ["eigenObject %d" % i for i in range(eigenObjects.shape[0])]
plot_gallery(eigenObjects, eigenObject_titles, h, w, False)
c = pic = 0
plt.figure(figsize=(20, 5))
plt.subplots_adjust(bottom=0, left=.01, right=.99, top=1.0, hspace=.25)
for i in range (0, 8):
    eigenObjects_idx = X_train[i]
    low_dim_rep_pca, reconstruct_img_pca = reconstruct_image(pca, eigenObjects_idx.reshape(1,-1))
    obj = classNames[labels[pic] - 1]
    plt.subplot(1,6,c+1)
    plt.imshow(eigenObjects_idx.reshape((h,w)), cmap=plt.cm.gray)
    plt.title('Original ' + obj)
    plt.grid(False)
    plt.xticks(()); plt.yticks(())
    plt.subplot(1,6,c+2)
    plt.imshow(reconstruct_img_pca.reshape((h,w)), cmap=plt.cm.gray)
    plt.title('Reconstructed ' + obj +' from Full PCA')
    plt.grid(False)
    plt.xticks(()); plt.yticks(())
    c += 2; pic += 1
    if c == 4:
        if i != 7:
            plt.figure(figsize=(20,5))
            plt.subplots_adjust(bottom=0, left=.01, right=.99, top=1.0, hspace=.25)
        c = 0
n_components = 350
print("Extracting the top %d eigenObjects from %d Objects" % (n_components, X_train.shape[0]))
rpca = PCA(n_components=n_components, svd_solver='randomized')
%time rpca.fit(X_train.copy())
eigenObjects = rpca.components_.reshape((n_components,h,w))
eigenObject_titles = ["eigenObject %d" % i for i in range(eigenObjects.shape[0])]
plot_gallery(eigenObjects, eigenObject_titles, h, w, False)
Extracting the top 350 eigenObjects from 5000 Objects
CPU times: user 16.8 s, sys: 160 ms, total: 17 s
Wall time: 4.43 s
c = pic = 0
plt.figure(figsize=(20, 5))
plt.subplots_adjust(bottom=0, left=.01, right=.99, top=1.0, hspace=.25)
for i in range (0, 8):
    eigenObjects_idx = X_train[i]
    low_dim_rep_rpca, reconstruct_img_rpca = reconstruct_image(rpca, eigenObjects_idx.reshape(1,-1))
    obj = classNames[labels[pic] - 1]
    plt.subplot(1,6,c+1)
    plt.imshow(eigenObjects_idx.reshape((h,w)), cmap=plt.cm.gray)
    plt.title('Original ' + obj)
    plt.grid(False)
    plt.xticks(()); plt.yticks(())
    plt.subplot(1,6,c+2)
    plt.imshow(reconstruct_img_rpca.reshape((h,w)), cmap=plt.cm.gray)
    plt.title('Reconstructed ' + obj +' from Full Random PCA')
    plt.grid(False)
    plt.xticks(()); plt.yticks(())
    c += 2; pic += 1
    if c == 4:
        if i != 7:
            plt.figure(figsize=(20,5))
            plt.subplots_adjust(bottom=0, left=.01, right=.99, top=1.0, hspace=.25)
        c = 0
from sklearn.decomposition import PCA
pca = PCA(n_components=400)
%time pca.fit(X_train)
X_train_pca = pca.transform(X_train)
pca_proj = pca.inverse_transform(X_train_pca)
loss = ((X_train - pca_proj) ** 2).mean()
print(loss)
rpca = PCA(n_components=400, svd_solver='randomized')
%time rpca.fit(X_train)
X_train_rpca = rpca.transform(X_train)
rpca_proj = rpca.inverse_transform(X_train_rpca)
loss = ((X_train - rpca_proj) ** 2).mean()
print(loss)
CPU times: user 20.1 s, sys: 208 ms, total: 20.3 s
Wall time: 5.21 s
350.60294825078336
CPU times: user 20 s, sys: 140 ms, total: 20.2 s
Wall time: 5.19 s
350.607027749715
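The loss printed above is the mean squared reconstruction error. For a PCA that is fitted and evaluated on the same data, this error should equal the variance left in the discarded components, which is why keeping more components lowers it. A small sketch on synthetic data checks that relationship; the data and variable names are illustrative, and the full SVD solver is used so the identity holds up to round-off.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic data: 30 features with unequal variance
X = rng.normal(size=(200, 30)) * np.linspace(5.0, 0.5, 30)
n, d = X.shape

pca = PCA(n_components=5, svd_solver="full").fit(X)
X_proj = pca.inverse_transform(pca.transform(X))
mse = ((X - X_proj) ** 2).mean()

# Variance not captured by the kept components (scikit-learn uses the n-1 divisor)
total_var = X.var(axis=0, ddof=1).sum()
residual_var = total_var - pca.explained_variance_.sum()
mse_from_variance = residual_var * (n - 1) / (n * d)

print(mse, mse_from_variance)
```

The two printed values should agree, so the reconstruction loss can be read directly off the explained-variance curve without running `inverse_transform`.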
# NOTE: a bare `%time` line times an empty statement; `%%time` as the first line of the cell would time the whole cell
%time
from skimage.io import imshow
from skimage.filters import sobel_h, sobel_v
c = pic = 0
plt.figure(figsize=(20, 5))
plt.subplots_adjust(bottom=0, left=.01, right=.99, top=1.0, hspace=.25)
for i in range (0, 8):
    obj = classNames[labels[pic] - 1]
    img = greyscale_vec[i].reshape((h,w))  # Sobel filters need the 2-D image, not the flattened row
    plt.subplot(1,6,c+1)
    plt.imshow(img, cmap=plt.cm.gray)
    plt.title('Original ' + obj)
    plt.grid(False)
    plt.xticks(()); plt.yticks(())
    plt.subplot(1,6,c+2)
    gradient_mag = np.sqrt(sobel_v(img)**2 + sobel_h(img)**2)
    plt.imshow(gradient_mag, cmap=plt.cm.gray)
    plt.title('Gradient ' + obj)
    plt.grid(False)
    plt.xticks(()); plt.yticks(())
    c += 2; pic += 1
    if c == 4:
        if i != 7:
            plt.figure(figsize=(20,5))
            plt.subplots_adjust(bottom=0, left=.01, right=.99, top=1.0, hspace=.25)
        c = 0
CPU times: user 2 µs, sys: 0 ns, total: 2 µs
Wall time: 4.29 µs
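The gradient image combines the horizontal and vertical Sobel responses into a magnitude, sqrt(gx^2 + gy^2), and the filters have to see the 2-D image rather than a flattened row. The sketch below hand-rolls the two 3x3 Sobel kernels in NumPy on a tiny synthetic image so the computation is visible. It is not skimage's implementation: the scaling and the border handling (edge padding here) are assumptions that may differ from `sobel_h` and `sobel_v`, and the helper names are hypothetical.

```python
import numpy as np

def filter3x3(img, kernel):
    """Cross-correlate a 2-D image with a 3x3 kernel, using edge padding at the border."""
    padded = np.pad(img.astype(np.float64), 1, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    rows, cols = img.shape
    for dr in range(3):
        for dc in range(3):
            out += kernel[dr, dc] * padded[dr:dr + rows, dc:dc + cols]
    return out

# Differences across rows respond to horizontal edges; the transpose responds to vertical edges
SOBEL_ROWS = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=np.float64)
SOBEL_COLS = SOBEL_ROWS.T

def gradient_magnitude(img):
    gx = filter3x3(img, SOBEL_COLS)
    gy = filter3x3(img, SOBEL_ROWS)
    return np.sqrt(gx ** 2 + gy ** 2)

# Synthetic 8x8 image: dark left half, bright right half (a single vertical edge)
img = np.zeros((8, 8))
img[:, 4:] = 1.0
mag = gradient_magnitude(img)
print(mag[0])
```

The magnitude is zero in the flat regions and non-zero only in the two columns next to the edge, which is the behavior the gradient plots rely on to highlight object outlines.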
from skimage.feature import daisy
idx = np.random.randint(len(greyscale_vec))  # random image index
img = greyscale_vec[idx].reshape((h,w))
features, img_desc = daisy(img, step=20, radius=20, rings=2, histograms=8, orientations=8, visualize=True)
imshow(img_desc)
plt.grid(False)
from sklearn.metrics.pairwise import pairwise_distances
def apply_daisy(row,shape):
    features = daisy(row.reshape(shape), step=20, radius=20, rings=2, histograms=8, orientations=4, visualize=False)
    return features.reshape((-1))
# find the pairwise distance between all the different image features
daisy_features = np.apply_along_axis(apply_daisy, 1, greyscale_vec, (h,w))
dist_matrix = pairwise_distances(daisy_features)
# NOTE: a bare `%time` line times an empty statement; `%%time` as the first line of the cell would time the whole cell
%time
c = pic = 0
plt.figure(figsize=(20, 5))
plt.subplots_adjust(bottom=0, left=.01, right=.99, top=1.0, hspace=.25)
for i in range (0, 8):
    distances = dist_matrix[i,:].copy()
    distances[i] = np.inf  # exclude the image itself (np.infty was removed in NumPy 2.0)
    j = np.argmin(distances)
    obj = classNames[labels[pic] - 1]
    plt.subplot(1,6,c+1)
    plt.imshow(greyscale_vec[i].reshape((h,w)), cmap=plt.cm.gray)
    plt.title('Original Image [' + str(i + 1) + ']')
    plt.grid(False)
    plt.xticks(()); plt.yticks(())
    plt.subplot(1,6,c+2)
    plt.imshow(greyscale_vec[j].reshape((h,w)), cmap=plt.cm.gray)
    plt.title('Closest Image [' + str(i + 1) + ']')
    plt.grid(False)
    plt.xticks(()); plt.yticks(())
    c += 2; pic += 1
    if c == 4:
        if i != 7:
            plt.figure(figsize=(20,5))
            plt.subplots_adjust(bottom=0, left=.01, right=.99, top=1.0, hspace=.25)
        c = 0
CPU times: user 2 µs, sys: 0 ns, total: 2 µs
Wall time: 4.29 µs
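The closest-image lookup reduces to an argmin over each row of the distance matrix once the diagonal is masked, since every image is at distance 0 from itself. A minimal sketch with made-up 2-D feature vectors (standing in for the DAISY descriptors) shows the idea, vectorized over all rows at once:

```python
import numpy as np
from sklearn.metrics.pairwise import pairwise_distances

# Hypothetical feature vectors: two tight pairs that are far apart from each other
features = np.array([[0.0, 0.0],
                     [0.0, 1.0],
                     [5.0, 5.0],
                     [5.0, 6.5]])

dist = pairwise_distances(features)  # Euclidean distance by default
np.fill_diagonal(dist, np.inf)       # an item must not match itself
nearest = dist.argmin(axis=1)        # index of the closest other item, per row

print(nearest)
```

Masking the whole diagonal once avoids copying a row of the matrix on every loop iteration.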
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
knn_pca = KNeighborsClassifier(n_neighbors=1)
knn_dsy = KNeighborsClassifier(n_neighbors=1)
pca_train, pca_test, dsy_train, dsy_test, y_train, y_test = train_test_split(X_train_pca, daisy_features, labels, test_size=0.2, train_size=0.8)
knn_pca.fit(pca_train,y_train)
acc_pca = accuracy_score(y_test, knn_pca.predict(pca_test))
knn_dsy.fit(dsy_train,y_train)
acc_dsy = accuracy_score(y_test, knn_dsy.predict(dsy_test))
print(f"PCA accuracy:{100*acc_pca:.2f}%, Daisy Accuracy:{100*acc_dsy:.2f}%")
PCA accuracy:26.00%, Daisy Accuracy:32.90%
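Because `train_test_split` shuffles randomly, the accuracies above will vary from run to run. A hedged sketch of a more repeatable evaluation fixes `random_state` and stratifies the split by class. It uses scikit-learn's bundled digits dataset as a stand-in, so the number it prints says nothing about STL-10, and the component count of 20 is an arbitrary illustrative choice. Fitting PCA on the training split only also keeps held-out images out of the feature extraction step.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)  # stand-in data: 8x8 digit images, 10 classes

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

pca = PCA(n_components=20).fit(X_tr)  # fit on the training split only
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(pca.transform(X_tr), y_tr)

acc = accuracy_score(y_te, knn.predict(pca.transform(X_te)))
chance = 1.0 / len(set(y))            # uniform-guess baseline for 10 classes
print(f"1-NN accuracy: {100 * acc:.2f}% (chance is about {100 * chance:.0f}%)")
```

Comparing against the chance baseline puts results in context: with 10 classes, the 26% and 32.9% reported above are well above the 10% guessing rate but far from the 80% target.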
Kaggle. STL-10. https://www.kaggle.com/jessicali9530/stl10 (Accessed 9-25-2020)
Adam Coates, Honglak Lee, Andrew Y. Ng. An Analysis of Single Layer Networks in Unsupervised Feature Learning. AISTATS, 2011.